Accuracy in Pharmacoeconomic Literature Review: Lessons Learned From the Navajo Code Talkers

The Navajo Code Talkers, who faithfully transmitted messages in an indecipherable code to support U.S. military efforts from 1942 to 1945, are widely credited with the achievement of numerous victories in the Pacific theater during World War II (WWII), among them the capture of Iwo Jima by the Marine Corps in 1945.1-3 Originally a highly classified military secret, the Code Talkers' work was completely unknown to the American public, and even to the Code Talkers' own families, until its declassification in 1968.4 Official recognition of the Code Talkers' accomplishments did not come until 1992, a half-century after their work began.2
The hallmark of the Code Talkers' work was a rapid but exacting translation process. Receiving messages in "a string of seemingly unrelated Navajo words," each Code Talker first translated the words to English, then used the first letter of each English word to spell out an English message.2 In addition, both for security purposes and to speed translation, the Code Talkers memorized a dictionary of an astonishing 450 Navajo words representing commonly used military terms; for example, the word "besh-lo" (iron fish) referred to a submarine, "dah-he-tih-hi" (hummingbird) was a fighter plane, and "debeh-li-zine" (black street) indicated a squad.2,3 Because of the innate complexity of the Navajo language, the use of multiple Navajo words to represent a single English letter, and the detailed nature of the translation process, the code was both impossible for anyone else to decipher and painstakingly difficult for the Code Talkers themselves. "When we were in the [training] classroom we were drilled and drilled," one Code Talker recalled in a 2005 interview. "No writing it down. It was all memorized. ... At the end of the class you had to hand in every pencil and piece of paper."1
Yet, despite the difficulty of the work and the pressures of performing it during combat, the Code Talkers consistently turned in error-free performances. During the first 2 days of the battle of Iwo Jima, 6 Code Talkers worked around the clock to transmit 800 messages, all with 100% accuracy.2
With some dismay, JMCP editors have recently noted that an increasing number of literature reviews, both in published work and in manuscripts submitted to us, bear little or no resemblance to the careful translations that characterized the heroic cryptographers of WWII. It is common for us to read in a submitted manuscript a statement that "in Disease A, Drug X is widely accepted as more efficacious than Drug Y," only to find in our research that the source cited for the statement investigated a different disease state, studied only a handful of people, produced a finding that Drug X was not superior to Drug Y, or otherwise did not support the alleged claim.
A case in point is found in examination of the literature published in the years following Soumerai et al.'s 1991 often-cited study of the effect of a medication coverage limit on a sample of chronically ill elderly Medicaid enrollees in New Hampshire.5 A comparison of the original study report with descriptions of the study that were provided in later manuscripts provides a revealing and at times disturbing look at the uses (and abuses) of medical literature applied in the service of making a point. This editorial reviews a sample of those descriptions and suggests a direction for the future.


The Original Study: Medicaid Prescription Drug Restrictions and Use of Hospital and Nursing Home Services
Soumerai et al.'s 1991 study assessed the relationship between a prescription drug "cap," a limit of 3 filled prescriptions per month per Medicaid recipient, and admissions to nursing homes and inpatient hospitals. The cap had been implemented in the New Hampshire Medicaid program in September 1981 and replaced 11 months later, in August 1982, with a $1 prescription drug copayment.5,6 The study authors used a quasi-experimental, time series with comparison group design to assess outcomes for a subset of Medicaid enrollees (n = 411) who met clinical and demographic criteria indicating older age and chronic illness (Table 1). The comparison group (n = 1,375), which was drawn from the New Jersey Medicaid population, met the same inclusion criteria but had not been subject to any prescription drug cost-sharing or coverage limitations.
Outcomes, including medication use and admissions to inpatient hospitals and nursing homes, were measured during a 5-month pre-implementation period, the 11-month cap period, and for 11 months after the replacement of the cap with the copayment. Measured in "standardized monthly doses" (SMDs) per patient per month (PPPM), defined as "the median number of milligrams of active ingredient per month received by all the patients who filed a claim for each study drug," medication use rates for the New Hampshire cohort declined from a preimplementation rate of 2.8 SMDs PPPM to a cap period rate of 1.9 SMDs PPPM. Time-series analysis indicated that the change represented a 35% drop. By the end of the 11-month cap period, nursing home admission rates were 10.6% and 6.6% for the New Hampshire and New Jersey cohorts, respectively. The relative risk (RR) of nursing home admission for the sample overall was 1.8 (95% confidence interval [CI] = 1.2-2.6).
The results most frequently reported for the study are derived from a subsample of study patients (48% of 411 in New Hampshire and 55% of 1,375 in New Jersey) who had at least 8 claims per year for at least 3 chronic medication classes (including the "core" therapeutic classes used in the sampling process plus 21 therapeutic classes for treatment of cardiovascular disease, diabetes, psychiatric disorders, pain, and other conditions). For this subsample, nursing home admission rates were 14.4% for New Hampshire and 6.2% for New Jersey; the reported RR of nursing home admission for New Hampshire was 2.2 (95% CI = 1.2-4.1).5 However, among those who did not meet the subsample criteria, the relationship between the cap and nursing home admission was not statistically significant.5 Moreover, even among those who did meet the subsample criteria, rates of hospital admission for New Hampshire and New Jersey Medicaid recipients during the cap period did not significantly differ (RR = 1.2, 95% CI = 0.8-1.6).5
Attributing their findings to either "declining health" because of "loss of medications" or "financial reasons" arising from additional medication expense, Soumerai et al. concluded that their findings "raise questions about the clinical and economic wisdom" of "limits on drug reimbursement" in state Medicaid programs.5
Shortly after the study's publication, a commentary by Schulz and Lewis raised several concerns about its conclusions, calling them "tentative at best."7 One of the most compelling of Schulz and Lewis's stated reasons for skepticism about the study's findings was the lack of relationship between the cap and hospital admission. If the cap had resulted in deteriorating health status, Schulz and Lewis pointed out, one would logically expect both hospital and nursing home use to be affected. Schulz and Lewis also appropriately questioned the logic of Soumerai et al.'s alternative explanation for their findings: that community-dwelling elderly had entered nursing homes in order to avoid paying out-of-pocket for medications. Schulz and Lewis concluded with the important prediction that, although the sampling criteria used by Soumerai et al. were so restrictive that their results were "not representative of a sizeable group of the population," the "appeal of the authors' conclusions to some [health care] constituencies" would raise "the danger that broad generalizations will be voiced, and policy decisions may be influenced, by a study whose conclusions should be interpreted cautiously."7
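As a rough illustration, and not as part of the original study report, the nursing home admission rates quoted above can be restated on an absolute scale next to the reported relative risks. The sketch below uses only figures given in this section. The class, function, and label names are hypothetical, and the RRs are carried through as published rather than recomputed, because the study authors' estimation method is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    """One New Hampshire vs. New Jersey comparison, as quoted in the text."""
    label: str          # hypothetical label, for display only
    rate_nh: float      # nursing home admission rate, New Hampshire (percent)
    rate_nj: float      # nursing home admission rate, New Jersey (percent)
    reported_rr: float  # relative risk as published, not recomputed here

    def absolute_difference(self) -> float:
        """Difference in admission rates, in percentage points."""
        return round(self.rate_nh - self.rate_nj, 1)


# Figures quoted in this section for the cap period.
COMPARISONS = [
    Comparison("full sample", 10.6, 6.6, 1.8),
    Comparison("regular-use subsample", 14.4, 6.2, 2.2),
]


def approx_subsample_size(total: int, share_percent: int) -> int:
    """Approximate head count implied by a reported percentage of a sample."""
    return round(total * share_percent / 100)


def describe(c: Comparison) -> str:
    """Report absolute rates alongside the published relative risk."""
    return (
        f"{c.label}: {c.rate_nh}% vs {c.rate_nj}% "
        f"(difference {c.absolute_difference()} percentage points; "
        f"reported RR {c.reported_rr})"
    )


if __name__ == "__main__":
    for comparison in COMPARISONS:
        print(describe(comparison))
    # Approximate subsample sizes implied by 48% of 411 and 55% of 1,375.
    print(approx_subsample_size(411, 48), approx_subsample_size(1375, 55))
```

Under these assumptions, the subsample comparison is rendered as "14.4% vs 6.2% (difference 8.2 percentage points; reported RR 2.2)", and the implied subsample sizes are approximately 197 and 756 enrollees.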

Descriptions of the Soumerai et al. Study in the Research Literature: 1991-2001
Schulz and Lewis's expectation that Soumerai et al.'s findings would have broad appeal was certainly realized in the years following the study's publication. To date, more than 300 published papers have cited the work.8 Consistent with Schulz and Lewis's concerns, examination of a convenience sample of these publications reveals numerous serious errors in describing the study's results (Table 2).9-24 The most common and serious error was attributing inpatient hospitalizations to the cap although Soumerai et al. found no significant relationship between the cap and hospital admission rates.9,11-13,19,21,23 Additional errors occurred when describing the cap itself, which was commonly represented as a formulary.12,16,19-21 Similarly, one description of the paper attributed negative outcomes to copayments despite Soumerai et al.'s finding that the risk of nursing home admission returned to its baseline (pre-cap) level when the cap was replaced by a $1 copayment.5,15 Notably, some discussions of the work cited outcomes that were not even studied by Soumerai et al.; these included changes in physician practice patterns, prescribing rates, emergency room use, and even mortality.12,14,18,22,24 More recently, an erroneous description of the study appeared in a 2007 publication (not shown in Table 2), which stated that the cap "was found to be associated with increased nursing home admissions 10 years after implementation."25 The statement implied that the risk of nursing home admission continued to be elevated 10 years after cap implementation; in fact, the original study report indicated that the nursing home admission rate reverted to pre-cap levels upon replacement of the 3-prescription cap with the $1 copayment.
As Schulz and Lewis predicted in 1992, a particularly common error, appearing in numerous publications not shown in the table, was failing to describe the highly selective clinical characteristics of Soumerai et al.'s subsample.26-35 Omission of the subsampling criterion of at least 8 claims per year for at least 3 chronic medication classes is an especially important mistake because among those with less regular use (i.e., either fewer than 3 chronic medication classes or less regular use of 3 or more chronic classes), the risk of nursing home admission was not significantly related to the cap.5 Thus, decision-makers relying on these publications for information would be given the erroneous impression that the increased nursing home risk applied to all Medicaid enrollees, to all elderly Medicaid enrollees, or to all elderly Medicaid enrollees with chronic disease. Such a misimpression limits the opportunity of the decision-maker to target benefit design decisions to those unlikely to be harmed by them. In that regard, it is notable that the subsample (n = approximately 197 [48% of 411] enrollees) in which the nursing home admission rate was elevated represented less than 2% of the 10,734 continuously enrolled New Hampshire Medicaid recipients affected by the cap.6

We Can Do Better: Literature Review That Is Both Accurate and Concise
A suggested checklist for literature review descriptions is shown in Table 3. Although not intended to be comprehensive, the checklist presents the core elements that should be included in a description of previously published research. As the examples demonstrate, descriptions do not have to be lengthy, and it rarely takes many more words to describe a study adequately than to describe it inadequately.
The key to informative literature review is specificity. Descriptions should be sufficiently complete to give managed care decision-makers a sense of the degree to which the study group represents the population about which decisions must be made. For example, a study sample consisting primarily of males treated at Department of Veterans Affairs (VA) clinics for congestive heart failure would appear to have limited applicability to a commercially insured population in a service industry such as banking that employs primarily young females. Descriptions should also indicate absolute rates rather than relying solely on relative rates. For example, the phrase "a 100% increase in risk" could represent a change either from 1% to 2% or from 25% to 50%, yet these ranges clearly have very different clinical and economic implications. Readers should also be given the temporal context for a study within clinically meaningful time frames; for example, they should be told if a study of routine clinical practice was conducted before or after the promulgation of key clinical guidelines or the development of an important technique that changed medical practice patterns.

Learning From the Navajo Code Talkers
Although pharmacoeconomic literature review is unlikely ever to approach the level of importance of the work of the Code Talkers, providing accurate information to managed care decision-makers is becoming increasingly critical as debate over health care policy proposals becomes progressively more intense. Consider what the outcome of WWII might have been, had the Navajo Code Talkers approached their work with the same complacency that characterizes too much of medical research literature review today. If those of us who promulgate information about health care evidence can achieve even a fraction of the dedication, attention to detail, and accuracy that characterized the work of the Code Talkers, we will be much closer to informed decision-making that will deliver high-quality and cost-effective care.

Table 1 (study inclusion criteria)
• Aged 60 years or older
• White race
• No nursing home admissions during the 6 pre-implementation months
• Filled a mean of 3 or more prescriptions per month (total of 36 claims) for any medications during baseline pre-implementation year
• At least 1 prescription per quarter for any medications during baseline year
• "Regular use" (at least 8 claims per year) of at least 1 "core" medication, defined as a drug in any of the following therapeutic classes: antianginal drugs, loop diuretics, antiarrhythmic agents, bronchodilators, inhaled steroids, insulin, anticoagulants, and anticonvulsants, during baseline year

Table 2 (excerpt)
Morreim, 1998 (reference 21): No; No; No
"Tightly constrained drug formularies may save short-term pharmacy costs, but they can raise rates of hospitalization and emergency room use because some patients experience greater side-effects and adherence problems from older, cheaper, or generic drugs that are not quite equivalent to their newer counterparts."
• The New Hampshire cap was not a formulary.
• Relationship between the cap and hospitalization risk was not statistically significant.
Powe, 1993 (reference 22): No